Live Migration Of Parallel Applications

Authors

  • Raul Fabian Romero
  • Fabian Romero
  • Thomas J. Hacker
  • John A. Springer
  • Eric T. Matson
  • Gary R. Bertoline
Abstract

Romero, Raul F. M.S., Purdue University, August 2010. Live Migration of Parallel Applications. Major Professor: Thomas J. Hacker. In engineering and scientific data centers, the absence of a clear separation between software and hardware can severely affect parallel applications. Applications that run across several nodes are especially vulnerable, because a single computational failure on one node often causes the entire application to produce incorrect results or even to terminate. Preserving the state of parallel jobs running on degraded nodes, and thereby avoiding runtime errors, requires a combination of proactive and reactive measures. This thesis addressed the problem of low reliability in parallel jobs by implementing a fault-tolerance approach based on OpenVZ virtualization. By running parallel applications inside virtual machines, the study showed that it was feasible to make parallel jobs independent of any particular hardware/software implementation; when a degraded node is detected, the virtual machines running on it can be migrated, together with their parallel jobs, to a healthier node. The study examined the correctness and performance of live migration on hosts loaded with parallel jobs, and determined that the state of parallel applications can be preserved efficiently when their virtual machines are live-migrated to a more reliable node.
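The proactive step described in the abstract (detect a degraded node, then move its virtual machines to a healthier one) can be illustrated with a minimal sketch. This is not the thesis implementation: the `Node` class, the numeric `health` score, the 0.5 threshold, and the least-loaded placement rule are all assumptions made for illustration. The only real tool referenced is OpenVZ's `vzmigrate` utility with its `--online` flag for live migration; the sketch only builds the command line and does not execute it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    name: str
    health: float          # hypothetical score in [0, 1]; 1.0 means fully healthy
    containers: List[int]  # OpenVZ container IDs currently hosted on this node


def plan_migrations(nodes: List[Node], threshold: float = 0.5) -> List[Tuple[int, str, str]]:
    """Return (ctid, source, destination) for every container on a degraded node.

    A node counts as degraded when its health score is below `threshold`
    (an assumed policy). Each container is sent to the healthy node that
    currently hosts the fewest containers.
    """
    healthy = [n for n in nodes if n.health >= threshold]
    if not healthy:
        return []  # nowhere safe to migrate to
    # Track load so evacuated containers are spread across healthy nodes.
    load = {n.name: len(n.containers) for n in healthy}
    plan = []
    for node in nodes:
        if node.health >= threshold:
            continue
        for ctid in node.containers:
            # Ties are broken by node name to keep the plan deterministic.
            dest = min(load, key=lambda name: (load[name], name))
            load[dest] += 1
            plan.append((ctid, node.name, dest))
    return plan


def vzmigrate_command(ctid: int, destination: str) -> List[str]:
    """Build (but do not run) a live-migration command for one container."""
    return ["vzmigrate", "--online", destination, str(ctid)]


if __name__ == "__main__":
    cluster = [
        Node("n1", 0.2, [101, 102]),  # degraded
        Node("n2", 0.9, [201]),
        Node("n3", 0.95, []),
    ]
    for ctid, src, dst in plan_migrations(cluster):
        print(src, "->", dst, " ".join(vzmigrate_command(ctid, dst)))
```

In a real deployment the health score would come from a monitoring system, and the command would be run on the source host; both are outside the scope of this sketch.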

Similar resources

A Near Optimal Approach in Choosing The Appropriate Physical Machines for Live Virtual Machines Migration in Cloud Computing

Migration of virtual machines (VMs) is a critical challenge in cloud computing. The process of moving VMs or applications from one physical machine (PM) to another is known as VM migration. Several issues must be considered in VM migration; one of the major ones is selecting an appropriate PM as the destination for a migrating VM. To face this issue, several approaches are...

Analytical evaluation of an innovative decision-making algorithm for VM live migration

Two strategies, "pre-copy" and "post-copy", have been proposed for live migration of virtual machines. Depending on the operating conditions of the machine, either strategy may perform better than the other. This article presents a new algorithm that automatically decides how the live migration of a virtual machine takes place. In this approach, the virtual machine m...

A Versioning Approach to VM Live Migration

In the context of virtual machine live migration, two strategies called "pre-copy" and "post-copy" have already been presented, but each works well only in some circumstances. In this paper, we briefly present QAVNS and then introduce a new approach which is based on the concept of "informational object", assigning QAVNS-scheme-revision number, and observing th...

Proactive process-level live migration and back migration in HPC environments

As the number of nodes in high-performance computing environments keeps increasing, faults are becoming commonplace. Reactive fault tolerance (FT) often does not scale due to massive I/O requirements and relies on manual job resubmission. This work complements reactive with proactive FT at the process level. Through health monitoring, a subset of node failures can be anticipated when one’s hea...

A version numbering scheme for informational objects used in VM live migration

Various numbering schemes are used to track different versions and revisions of files, software packages, and documents. One major challenge in this regard is the lack of an all-purpose, adaptive, comprehensive, and efficient standard. To resolve this challenge, this article presents the Quadruple Adaptive Version Numbering Scheme. In the proposed scheme, the version identifier consists of four integ...

Publication date: 2013